completeness score
- Europe > Italy > Marche > Ancona Province > Ancona (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Massachusetts (0.04)
- North America > Canada (0.04)
- Leisure & Entertainment (1.00)
- Media > Film (0.68)
OpenCML: End-to-End Framework of Open-world Machine Learning to Learn Unknown Classes Incrementally
Parmar, Jitendra, Thakur, Praveen Singh
Open-world machine learning is an emerging technique in artificial intelligence. Conventional machine learning models often follow closed-world assumptions, which can hinder their ability to retain previously learned knowledge for future tasks. Automated intelligence systems, however, must learn novel classes while retaining previously known tasks. The proposed model learns novel classes in an open and continuous learning environment. It consists of two different but connected tasks. First, it discovers unknown classes in the data and creates novel classes; next, it performs class-incremental learning for each new class. Together, these tasks enable continual learning, allowing the system to expand its understanding of the data and improve over time. The proposed model outperformed existing approaches in open-world learning. Furthermore, it demonstrated strong performance in continuous learning, achieving a highest average accuracy of 82.54% over four iterations and a minimum accuracy of 65.87%.
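The two connected tasks described in the abstract (discover unknown classes, then learn them incrementally) can be illustrated with a toy sketch. This is not the OpenCML architecture: it swaps in a nearest-class-mean classifier with a distance threshold as the reject option, and the class name `OpenWorldNCM`, the threshold value, and the sample data are all hypothetical.

```python
import math

def mean(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

class OpenWorldNCM:
    """Nearest-class-mean classifier with a distance-based reject option.

    Hypothetical illustration only; not the OpenCML model.
    """

    def __init__(self, threshold):
        self.threshold = threshold   # assumed rejection radius
        self.centroids = {}          # label -> centroid vector

    def learn_class(self, label, samples):
        # Incremental step: adding a class never touches older centroids.
        self.centroids[label] = mean(samples)

    def predict(self, x):
        # Return the nearest known label, or None if x is too far from all.
        best_label, best_dist = None, float("inf")
        for label, c in self.centroids.items():
            d = math.dist(x, c)
            if d < best_dist:
                best_label, best_dist = label, d
        return best_label if best_dist <= self.threshold else None

model = OpenWorldNCM(threshold=1.5)
model.learn_class("cat", [[0.0, 0.0], [0.2, 0.1]])
model.learn_class("dog", [[5.0, 5.0], [5.1, 4.9]])

stream = [[0.1, 0.0], [10.0, 0.0], [10.2, 0.1], [5.0, 5.1]]
unknown = [x for x in stream if model.predict(x) is None]

# Stage 1: treat the rejected samples as one novel class (a real system
# would cluster them first). Stage 2: learn that class incrementally.
if unknown:
    model.learn_class("novel_0", unknown)
```

Because each class is stored as an independent centroid, learning `novel_0` leaves the earlier classes untouched, which is the property a class-incremental learner needs to avoid forgetting.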
- Asia > India > Madhya Pradesh (0.04)
- Europe > Belgium > Flanders > East Flanders > Ghent (0.04)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
- Asia > India > Maharashtra > Mumbai (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.70)
- Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Unsupervised Document and Template Clustering using Multimodal Embeddings
Sampaio, Phillipe R., Maxcici, Helene
We study unsupervised clustering of documents at both the category and template levels using frozen multimodal encoders and classical clustering algorithms. We systematize a model-agnostic pipeline that (i) projects heterogeneous last-layer states from text-layout-vision encoders into token-type-aware document vectors and (ii) performs clustering with centroid- or density-based methods, including an HDBSCAN + $k$-NN assignment to eliminate unlabeled points. We evaluate eight encoders (text-only, layout-aware, vision-only, and vision-language) with $k$-Means, DBSCAN, HDBSCAN + $k$-NN, and BIRCH on five corpora spanning clean synthetic invoices, their heavily degraded print-and-scan counterparts, scanned receipts, and real identity and certificate documents. The study reveals modality-specific failure modes and a robustness-accuracy trade-off, with vision features nearly solving template discovery on clean pages while text dominates under covariate shift, and fused encoders offering the best balance. We detail a reproducible, oracle-free tuning protocol and the curated evaluation settings to guide future work on unsupervised document organization.
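The "HDBSCAN + $k$-NN assignment" step mentioned in the abstract can be sketched as follows. The abstract does not spell out the procedure, so this is one plausible reading: points that a density-based clusterer leaves unlabeled (assumed here to carry the label `-1`) receive the majority label of their $k$ nearest clustered neighbours. The function name `assign_noise_with_knn` and the sample data are hypothetical.

```python
import math
from collections import Counter

def assign_noise_with_knn(points, labels, k=3, noise=-1):
    """Give every noise point the majority label of its k nearest
    clustered neighbours. Labels of clustered points are left unchanged."""
    clustered = [(p, l) for p, l in zip(points, labels) if l != noise]
    if not clustered:
        return list(labels)  # nothing to vote with
    result = []
    for p, l in zip(points, labels):
        if l != noise:
            result.append(l)
            continue
        # Rank clustered points by Euclidean distance to the noise point.
        nearest = sorted(clustered, key=lambda pl: math.dist(p, pl[0]))[:k]
        votes = Counter(lbl for _, lbl in nearest)
        result.append(votes.most_common(1)[0][0])
    return result

# Toy data: two tight clusters plus two points flagged as noise (-1).
points = [[0, 0], [0, 1], [1, 0], [9, 9], [9, 10], [10, 9], [2, 2], [8, 8]]
labels = [0, 0, 0, 1, 1, 1, -1, -1]
full_labels = assign_noise_with_knn(points, labels, k=3)
```

After this step every document has a cluster label, which makes the result directly comparable with centroid-based methods such as $k$-Means that never leave points unassigned.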
- South America > Brazil > Rio Grande do Sul > Porto Alegre (0.04)
- North America > United States > North Dakota > McKenzie County (0.04)
- Europe > Switzerland (0.04)
- (3 more...)
- Media > Film (0.47)
- Leisure & Entertainment (0.47)
- Europe > Italy > Marche > Ancona Province > Ancona (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Massachusetts (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Leisure & Entertainment (1.00)
- Media > Film (0.68)
Addressing Leakage in Concept Bottleneck Models
Concept bottleneck models (CBMs) (Chen et al., 2020; Koh et al., 2020; Yeh et al., 2020; Lage & Doshi-Velez, 2020; Wang et al., 2017) propose explicitly aligning the intermediate layers of a [...] The concept bottleneck model (CBM) makes its prediction in two stages. While soft concepts may improve predictive performance, this improvement comes at a cost.
Investigating the Potential of Using Large Language Models for Scheduling
The inaugural ACM International Conference on AI-powered Software introduced the AIware Challenge, prompting researchers to explore AI-driven tools for optimizing conference programs through constrained optimization. We investigate the use of Large Language Models (LLMs) for program scheduling, focusing on zero-shot learning and integer programming to measure paper similarity. Our study reveals that LLMs, even under zero-shot settings, create reasonably good first drafts of conference schedules. When clustering papers, using only titles as LLM inputs produces results closer to human categorization than using titles and abstracts with TF-IDF. The code has been made publicly available.
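For readers unfamiliar with the TF-IDF baseline mentioned in the abstract, the sketch below scores paper similarity with TF-IDF vectors and cosine similarity. It is a minimal illustration, not the study's code: the smoothed IDF variant, the whitespace tokenizer, and the three example titles are assumptions made here.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per whitespace-tokenised doc."""
    tokenised = [d.lower().split() for d in docs]
    n = len(tokenised)
    # Document frequency: in how many docs does each term appear?
    df = Counter(t for toks in tokenised for t in set(toks))
    # Smoothed IDF, one common variant: log((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for toks in tokenised:
        tf = Counter(toks)
        vectors.append({t: (cnt / len(toks)) * idf[t] for t, cnt in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical paper titles, invented for this example.
titles = [
    "concept bottleneck models for image classification",
    "leakage in concept bottleneck models",
    "scheduling conference programs with integer programming",
]
vecs = tfidf_vectors(titles)
sim_related = cosine(vecs[0], vecs[1])     # share three terms
sim_unrelated = cosine(vecs[0], vecs[2])   # share no terms
```

A pairwise similarity matrix built this way could feed a clustering step or a session-assignment objective; because it matches only on shared surface terms, titles on the same topic with different wording score zero.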
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)
- South America > Brazil (0.06)
- North America > United States > New York > New York County > New York City (0.04)
Multi-dimensional concept discovery (MCD): A unifying framework with completeness guarantees
Vielhaben, Johanna, Blücher, Stefan, Strodthoff, Nils
For the trustworthy application of XAI, in particular for high-stakes decisions, a more global model understanding is required. To this end, concept-based methods have been proposed, which, however, are not guaranteed to be bound to the actual model reasoning. To circumvent this problem, we propose Multi-dimensional Concept Discovery (MCD) as an extension of previous approaches that fulfills a completeness relation on the level of concepts. Our method starts from general linear subspaces as concepts and requires neither reinforcing concept interpretability nor re-training of model parts. We propose sparse subspace clustering to discover improved concepts and fully leverage the potential of multi-dimensional subspaces. MCD offers two complementary analysis tools for concepts in input space: (1) concept activation maps, which show where a concept is expressed within a sample, allowing for concept characterization through prototypical samples, and (2) concept relevance heatmaps, which decompose the model decision into concept contributions. Both tools together enable a detailed global understanding of the model reasoning, which is guaranteed to relate to the model via a completeness relation. Thus, MCD paves the way towards more trustworthy concept-based XAI. We empirically demonstrate the superiority of MCD against more constrained concept definitions.
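The idea of decomposing a model decision into concept contributions that add up to the full output can be sketched for the simplest case. The example assumes a linear read-out `w . phi` on a feature vector `phi` and mutually orthogonal concept subspaces given by orthonormal bases; these assumptions, the function names, and all numbers are hypothetical and are not taken from the MCD paper.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(phi, basis):
    """Project phi onto the span of an orthonormal basis (list of vectors)."""
    out = [0.0] * len(phi)
    for b in basis:
        coeff = dot(phi, b)
        out = [o + coeff * bi for o, bi in zip(out, b)]
    return out

def concept_relevances(phi, w, concepts):
    """Split the linear score w . phi into one term per concept subspace
    plus a residual term for the part of phi outside all concepts."""
    parts = {name: project(phi, basis) for name, basis in concepts.items()}
    residual = list(phi)
    for p in parts.values():
        residual = [r - pi for r, pi in zip(residual, p)]
    relevances = {name: dot(w, p) for name, p in parts.items()}
    relevances["residual"] = dot(w, residual)
    return relevances

# Hypothetical 4-d feature vector, read-out weights, and two concepts.
phi = [2.0, -1.0, 0.5, 3.0]
w = [0.5, -1.0, -2.0, 0.25]
concepts = {
    "concept_a": [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],  # 2-d subspace
    "concept_b": [[0.0, 0.0, 1.0, 0.0]],                        # 1-d subspace
}
rel = concept_relevances(phi, w, concepts)
# Completeness check: the relevances sum to the full score w . phi.
total = sum(rel.values())
```

The residual term is what makes the decomposition complete: whatever part of the feature vector the discovered concepts fail to capture is still accounted for, so the terms always sum to the model's score.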
- Asia > Middle East > Jordan (0.04)
- North America > United States > Pennsylvania (0.04)
- Europe > France (0.04)
- Health & Medicine > Diagnostic Medicine > Imaging (0.67)
- Health & Medicine > Nuclear Medicine (0.46)
On Concept-Based Explanations in Deep Neural Networks
Yeh, Chih-Kuan, Kim, Been, Arik, Sercan O., Li, Chun-Liang, Ravikumar, Pradeep, Pfister, Tomas
Deep neural networks (DNNs) build high-level intelligence on low-level raw features. Understanding of this high-level intelligence can be enabled by deciphering the concepts they base their decisions on, much as in human-level thinking. In this paper, we study concept-based explainability for DNNs in a systematic framework. First, we define the notion of completeness, which quantifies how sufficient a particular set of concepts is in explaining a model's prediction behavior. Based on performance and variability motivations, we propose two definitions to quantify completeness. We show that under degenerate conditions, our method is equivalent to Principal Component Analysis. Next, we propose a concept discovery method that considers two additional constraints to encourage the interpretability of the discovered concepts. We use game-theoretic notions to aggregate over sets to define an importance score for each discovered concept, which we call ConceptSHAP. On specifically designed synthetic datasets and real-world text and image datasets, we validate the effectiveness of our framework in finding concepts that are complete in explaining the decision, and interpretable.
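A small worked example may help make the two ideas in this abstract concrete: a completeness score for a set of concepts, and a game-theoretic importance score per concept. The sketch assumes completeness is normalized as the share of the model's above-chance accuracy recovered when predicting from concepts alone, and aggregates over concept subsets with exact Shapley values. The abstract does not state these formulas, so treat the normalization as an assumption; the concept names and accuracies are invented.

```python
from itertools import combinations
from math import factorial

def completeness(acc_concepts, acc_model, acc_random):
    """Share of the model's above-chance accuracy recovered from concepts.
    Assumed normalization; undefined when acc_model == acc_random."""
    return (acc_concepts - acc_random) / (acc_model - acc_random)

def shapley_values(players, value):
    """Exact Shapley value of each player for a set function `value`."""
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result

# Hypothetical accuracies reached when predicting from each concept subset.
ACC_MODEL, ACC_RANDOM = 0.90, 0.50
ACC_BY_SUBSET = {
    frozenset(): 0.50,
    frozenset({"stripes"}): 0.70,
    frozenset({"wheels"}): 0.60,
    frozenset({"stripes", "wheels"}): 0.90,
}

def eta(subset):
    """Completeness of a concept subset under the assumed normalization."""
    return completeness(ACC_BY_SUBSET[frozenset(subset)], ACC_MODEL, ACC_RANDOM)

scores = shapley_values(["stripes", "wheels"], eta)
```

With these toy numbers the full concept set has completeness 1, and the per-concept scores sum to that value, which is the efficiency property of Shapley values. Exact enumeration grows exponentially with the number of concepts, so a larger concept set would need sampling.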
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Massachusetts (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)